{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Jupyter Notebook to test Rayyan Loader"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install dependencies\n",
    "\n",
    "```bash\n",
    "pip install -r notebook-requirements.txt\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configure OpenAI with your API key\n",
    "\n",
    "Make sure you have a file named `.env` in the same directory as this notebook, with the following contents:\n",
    "\n",
    "```\n",
    "OPENAI_API_KEY=<your key here>\n",
    "OPENAI_ORGANIZATION=<your organization here>\n",
    "```\n",
    "\n",
    "The organization is optional, but if you are part of multiple organizations, you can specify which one you want to use. Otherwise, the default organization will be used.\n",
    "\n",
    "Optionally, to enable NewRelic monitoring, add the following to your `.env` file:\n",
    "\n",
    "```\n",
    "NEW_RELIC_APP_NAME=<your app name here>\n",
    "NEW_RELIC_LICENSE_KEY=<your key here>\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import sys\n",
    "import logging\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "logging.basicConfig(stream=sys.stderr, level=logging.INFO)\n",
    "logger = logging.getLogger(__name__)\n",
    "\n",
    "load_dotenv()  # take environment variables from .env.\n",
    "\n",
    "logger.debug(f\"NewRelic application: {os.getenv('NEW_RELIC_APP_NAME')}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load a Rayyan review into LLama Index\n",
    "\n",
    "Make sure to have a Rayyan credentials file in `rayyan-creds.json`.\n",
    "Check the [Rayyan SDK](https://github.com/rayyansys/rayyan-python-sdk) for more details.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:Signed in successfully to Rayyan as: Hossam Hammady!\n",
      "INFO:root:Working on review: 'PICO on-demand' with 900 total articles.\n",
      "INFO:root:Fetching articles from Rayyan...\n",
      "100%|██████████| 900/900 [00:05<00:00, 166.16it/s]\n",
      "INFO:__main__:Indexing articles...\n",
      "INFO:__main__:Done indexing articles in 42.46 seconds.\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "from time import time\n",
    "\n",
    "from nr_openai_observability import monitor\n",
    "from llama_index import VectorStoreIndex, download_loader\n",
    "\n",
    "if os.getenv(\"NEW_RELIC_APP_NAME\") and os.getenv(\"NEW_RELIC_LICENSE_KEY\"):\n",
    "    monitor.initialization(application_name=os.getenv(\"NEW_RELIC_APP_NAME\"))\n",
    "\n",
    "# Uncomment to download the loader from another repository\n",
    "# RayyanReader = download_loader(\"RayyanReader\", loader_hub_url=\"https://raw.githubusercontent.com/rayyansys/llama-hub/rayyan-loader/llama_hub\")\n",
    "RayyanReader = download_loader(\"RayyanReader\")\n",
    "loader = RayyanReader(credentials_path=\"rayyan-creds.json\")\n",
    "\n",
    "# documents = loader.load_data(review_id=746345, filters={\"search[value]\": \"outcome\"})\n",
    "documents = loader.load_data(review_id=746345)\n",
    "logger.info(\"Indexing articles...\")\n",
    "t1 = time()\n",
    "review_index = VectorStoreIndex.from_documents(documents)\n",
    "t2 = time()\n",
    "logger.info(f\"Done indexing articles in {t2 - t1:.2f} seconds.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Query LLama Index about the review data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "❓ Query 1/5: What are the most used interventions?\n",
      "Waiting for response...\n",
      "🤖 The most used interventions mentioned in the context are the Healthy Choices motivational interviewing intervention and the PlayForward: Elm City Stories videogame intervention. These interventions were developed to target multiple risk behaviors among HIV-positive youth and young minority teens, respectively. The Healthy Choices intervention focused on reducing alcohol and marijuana use, while the PlayForward intervention aimed to teach knowledge and skills for preventing HIV infection.\n",
      "Relevant articles:\n",
      "- [540581301] Alcohol and marijuana use outcomes in the healthy choices motivational interviewing intervention for HIV-positive youth\n",
      "- [540581177] A videogame intervention for risk reduction and prevention in young minority teens\n",
      "\n",
      "❓ Query 2/5: What is the most common population?\n",
      "Waiting for response...\n",
      "🤖 The most common population in the given context is fisher-folk communities (FFC) in Uganda.\n",
      "Relevant articles:\n",
      "- [566755104] High HIV-1 prevalence, risk behaviours, and willingness to participate in HIV vaccine trials in fishing communities on Lake Victoria, Uganda\n",
      "- [540581159] A model for provision of ENT health care service at primary and secondary hospital level in a developing country\n",
      "\n",
      "❓ Query 3/5: Are there studies about children?\n",
      "Waiting for response...\n",
      "🤖 Yes, there are studies mentioned in the context information that specifically focus on children. One study evaluates the age ranges and age-subgroup analyses in pediatric randomized clinical trials and meta-analyses. Another study assesses the quality of outpatient pediatric care provided by township and village doctors in rural Hebei, China. Both studies provide insights into various aspects of children's health and healthcare.\n",
      "Relevant articles:\n",
      "- [540583688] Empirical evaluation of age groups and age-subgroup analyses in pediatric randomized trials and pediatric meta-analyses\n",
      "- [540582609] Care-seeking and quality of care for outpatient sick children in rural Hebei, China: A cross-sectional study\n",
      "\n",
      "❓ Query 4/5: Do we have any studies about COVID-19?\n",
      "Waiting for response...\n",
      "🤖 There is no information provided in the context about studies specifically related to COVID-19.\n",
      "Relevant articles:\n",
      "- [540583712] Epidemiological analysis of critically ill adult patients with pandemic influenza A(H1N1) in South Korea\n",
      "- [540582543] Asthma and severity of the 2009 novel H1N1 influenza: A case-control study\n",
      "\n",
      "❓ Query 5/5: Are there any multi-center randomized clinical trials?\n",
      "Waiting for response...\n",
      "🤖 Yes, there are multi-center randomized clinical trials mentioned in the context information. One study protocol describes a cluster randomized controlled trial that will involve 16 community emergency departments.\n",
      "Relevant articles:\n",
      "- [540583688] Empirical evaluation of age groups and age-subgroup analyses in pediatric randomized trials and pediatric meta-analyses\n",
      "- [540582574] Best strategies to implement clinical pathways in an emergency department setting: study protocol for a cluster randomized controlled trial\n",
      "\n"
     ]
    }
   ],
   "source": [
    "query_engine = review_index.as_query_engine()\n",
    "prompts = [\n",
    "    \"What are the most used interventions?\",\n",
    "    \"What is the most common population?\",\n",
    "    \"Are there studies about children?\",\n",
    "    \"Do we have any studies about COVID-19?\",\n",
    "    \"Are there any multi-center randomized clinical trials?\",\n",
    "]\n",
    "for idx, prompt in enumerate(prompts):\n",
    "    print(f\"❓ Query {idx + 1}/{len(prompts)}: {prompt}\")\n",
    "    print(\"Waiting for response...\")\n",
    "    response = query_engine.query(prompt)\n",
    "    print(f\"🤖 {response.response}\")\n",
    "    print(\"Relevant articles:\")\n",
    "    for article in response.metadata.values():\n",
    "        print(f\"- [{article['id']}] {article['title']}\")\n",
    "    print()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
